KrigR - Climate Data For Your Spatial Analyses
An Introduction To The KrigR Package
Setting Things Up
Installing KrigR
First of all, we need to install KrigR. Since we are currently preparing the package for submission to CRAN, it is only available through the associated GitHub repository (https://github.com/ErikKusch/KrigR) for the time being.
Here, we use the devtools package within R to get access to the install_github() function. For this to run, we need to tell R to not stop the installation from GitHub as soon as a warning is produced. This would stop the installation process as early as one of the KrigR dependencies hits a warning as benign as “Package XYZ was built under R Version X.X.X” which can usually be ignored safely.
Sys.setenv(R_REMOTES_NO_ERRORS_FROM_WARNINGS="true")
devtools::install_github("https://github.com/ErikKusch/KrigR")
library(KrigR)For the sake of this tutorial, we need some extra packages for visualisations. We check if they are already installed, subsequently install (if necessary) and load them with this user-defined function:
install.load.package <- function(x){
if (!require(x, character.only = TRUE))
install.packages(x, repos='http://cran.us.r-project.org')
require(x, character.only = TRUE)
}
package_vec <- c("tidyr", # for turning rasters into ggplot-dataframes
"ggplot2", # for plotting
"viridis", # colour palettes
"cowplot", # gridding multiple plots
"ggmap" # obtaining google maps
)
sapply(package_vec, install.load.package)## tidyr ggplot2 viridis cowplot ggmap
## TRUE TRUE TRUE TRUE TRUE
Before we can proceed, you need to open up an account at the CDS and create an API key by following this link: https://cds.climate.copernicus.eu/api-how-to. This is required so that you may issue download requests through the KrigR package. Once you have done so, please register the user ID and API Key as characters as seen below:
Setting Up Directories
The tutorial is designed to run completely from scratch. For this to work in a structured way, we create a folder/directory structure so that we got some nice compartments on our hard drives for where each separate Kriging process is run. We create the following directories:
- A data directory for all of our individual Kriging processes
- A shapefile directory (located within our data directory) for all of the shapefiles that we will use
Downloading Shapefiles
Here, we download some of the most commonly used shapefiles (for our analyses using KrigR so far, at least). For repeat code-sourcing, we have implemented checks which only trigger the download of a given shapefile set if the data in question is not present in our Shapes directory yet.
This tutorial doesn’t make use of all the shapefiles we download here. We simply thought it prudent to show you what we have found to be useful and how to get your hands on the data in a reproducible way.
Land Cover
As we will see in this tutorial, masking out unnecessary areas from our analyses speeds up Kriging tremendously. Often, this will entail limiting data sets to terrestrial habitats. This land mask here does a terrific job at masking out non-land areas.
#### LAND MASK (for masking covariates according to land vs. sea)
if(!file.exists(file.path(Dir.Shapes, "LandMask.zip"))){ # if not downloaded yet
download.file(paste0("https://www.naturalearthdata.com/http//www.naturalearthdata.com",
"/download/10m/physical/ne_10m_land.zip"),
destfile = paste(Dir.Shapes, "LandMask.zip", sep="/")) # download cultural vector
unzip(paste(Dir.Shapes, "LandMask.zip", sep="/"), exdir = Dir.Shapes) # unzip data
}
LandMask <- readOGR(Dir.Shapes, "ne_10m_land", verbose = FALSE) # read
ggplot() +
geom_polygon(data = LandMask, aes(x = long, y = lat, group = group), colour = "darkred", fill = "black") +
theme_bw() + labs(x = "Longitude", y = "Latitude")States & Provinces
Political divisions are often what policy makers are after. Hence, we also download a shapefile for states and provinces across the Globe.
#### STATE MASK (for within-nation borders)
if(!file.exists(file.path(Dir.Shapes, "StateMask.zip"))){ # if not downloaded yet
download.file(paste0("https://www.naturalearthdata.com/http//www.naturalearthdata.com/",
"download/10m/cultural/ne_10m_admin_1_states_provinces.zip"),
destfile = paste(Dir.Shapes, "StateMask.zip", sep="/")) # download cultural vector
unzip(paste(Dir.Shapes, "StateMask.zip", sep="/"), exdir = Dir.Shapes) # unzip data
}
StateMask <- readOGR(Dir.Shapes, "ne_10m_admin_1_states_provinces", verbose = FALSE) # read
ggplot() +
geom_polygon(data = StateMask, aes(x = long, y = lat, group = group), colour = "darkred", fill = "black") +
theme_bw() + labs(x = "Longitude", y = "Latitude")As you can see, this shapefile provides a set of rather small (geographically speaking) regions. We will use these for showing how Kriging works within this tutorial because Kriging runs in a timely manner for these regions.
Additionally, it is worth pointing out that www.naturalearthdata.com offers a large host of further shapefile data sets including (but not limited to):
- National borders
- Rivers and Lakes
- Urban areas
- Reefs and Bathymetry
We strongly recommend looking there early on in your projects for any shapefile needs you may have.
Ecoregions
When we’re not bound by political boundaries, ecoregions can often help to limit the spatial scope of our macroecological studies to manageable sizes (as far as Kriging effort is concerned).
#### ECOREGIONS (for ecoregional borders)
if(!file.exists(file.path(Dir.Shapes, "WWF_ecoregions"))){ # if not downloaded yet
download.file(paste0("http://assets.worldwildlife.org/publications/15/files/",
"original/official_teow.zip"),
destfile = file.path(Dir.Shapes, "wwf_ecoregions.zip")) # download regions
unzip(file.path(Dir.Shapes, "wwf_ecoregions.zip"), exdir = file.path(Dir.Shapes, "WWF_ecoregions")) # unzip data
}
EcoregionMask <- readOGR(file.path(Dir.Shapes, "WWF_ecoregions", "official", "wwf_terr_ecos.shp"), verbose = FALSE) # read
ggplot() +
geom_polygon(data = EcoregionMask, aes(x = long, y = lat, group = group), colour = "darkred", fill = "black") +
theme_bw() + labs(x = "Longitude", y = "Latitude")This particular data set offers the shapes of biomes and biogeographical realms across the Earth. As far as Kriging with the KrigR package is concerned, we heavily advise against Kriging using biogeographical realms without further consideration since these large regions lead to extreme processing requirements.
Plotting Functions
In order to easily visualise our Kriging procedure including (1) inputs, (2) covariates, and (3) outputs without repeating too much of the same code, we create a few plotting functions of our own here.
Don’t worry about understanding how these work off the bat here. Kriging and the package KrigR are what we want to demonstrate here - not visualisation strategies.
Raw Data
This function plots the raw data that we will krig and exports a single plot of all layers of the input raster:
Plot_Raw <- function(Raw, Shp = NULL, Dates){
Raw_df <- as.data.frame(Raw, xy = TRUE) # turn raster into dataframe
colnames(Raw_df)[c(-1,-2)] <- Dates # set colnames
Raw_df <- gather(data = Raw_df, key = Values, value = "value", colnames(Raw_df)[c(-1,-2)]) # make ggplot-ready
Raw_plot <- ggplot() + # create a plot
geom_raster(data = Raw_df , aes(x = x, y = y, fill = value)) + # plot the raw data
facet_wrap(~Values) + # split raster layers up
theme_bw() + labs(x = "Longitude", y = "Latitude") + # make plot more readable
scale_fill_gradientn(name = "Air Temperature [K]", colours = inferno(100)) # add colour and legend
if(!is.null(Shp)){ # if a shape has been designated
Raw_plot <- Raw_plot + geom_polygon(data = Shp, aes(x = long, y = lat, group = group), colour = "black", fill = "NA") # add shape
}
return(Raw_plot)} # export the plotCovariates
This function plots the covariate data we will be using by showing each variable on both resolution levels side-by-side:
Plot_Covs <- function(Covs, Shp = NULL){
Plots_ls <- as.list(rep(NA, nlayers(Covs[[1]])*2)) # create as many plots as there are covariates variables * 2
Counter <- 1 # counter for list position
for(Variable in 1:nlayers(Covs[[1]])){ # loop over all covariate variables
Covs_Iter <- list(Covs[[1]][[Variable]], Covs[[2]][[Variable]]) # extract the data for this variable
for(Plot in 1:2){ # loop over both resolutions for the current variable
Cov_df <- as.data.frame(Covs_Iter[[Plot]], xy = TRUE) # turn raster into data frame
colnames(Cov_df)[3] <- "Values" # assign a column name to the data column
Plots_ls[[Counter]] <- ggplot() + # create plot
geom_raster(data = Cov_df , aes(x = x, y = y, fill = Values)) + # plot the covariate data
theme_bw() + labs(x = "Longitude", y = "Latitude") + # make plot more readable
scale_fill_gradientn(name = names(Covs[[Plot]]), colours = cividis(100)) + # add colour and legend
theme(plot.margin = unit(c(0, 0, 0, 0), "cm")) # reduce margins (for fusing of plots)
if(!is.null(Shp)){ # if a shape has been designated
Plots_ls[[Counter]] <- Plots_ls[[Counter]] + geom_polygon(data = Shp, aes(x = long, y = lat, group = group), colour = "black", fill = "NA") # add shape
}
Counter <- Counter + 1 # raise list counter
} # end of resolution loop
} # end of variable loop
ggPlot <- plot_grid(plotlist = Plots_ls, ncol = 2, labels = "AUTO") # fuse the pplots into one big plot
return(ggPlot)} # export the plotKriged Products
This function plots the Kriging outputs by splitting them up according to whether they are Kriging predictions or the uncertainties associated with them:
Plot_Krigs <- function(Krigs, Shp = NULL, Dates){
Type_vec <- c("Prediction", "Standard Error") # these are the output typess of krigR
Colours_ls <- list(inferno(100), rev(viridis(100))) # we want separate colours for the types
Plots_ls <- as.list(NA, NA) # this list will be filled with the output plots
for(Plot in 1:2){ # loop over both output types
Krig_df <- as.data.frame(Krigs[[Plot]], xy = TRUE) # turn raster into dataframe
colnames(Krig_df)[c(-1,-2)] <- paste(Type_vec[Plot], Dates) # set colnames
Krig_df <- gather(data = Krig_df, key = Values, value = "value", colnames(Krig_df)[c(-1,-2)]) # make ggplot-ready
Plots_ls[[Plot]] <- ggplot() + # create plot
geom_raster(data = Krig_df , aes(x = x, y = y, fill = value)) + # plot the kriged data
facet_wrap(~Values) + # split raster layers up
theme_bw() + labs(x = "Longitude", y = "Latitude") + # make plot more readable
scale_fill_gradientn(name = "Air Temperature [K]", colours = Colours_ls[[Plot]]) + # add colour and legend
theme(plot.margin = unit(c(0, 0, 0, 0), "cm")) # reduce margins (for fusing of plots)
if(!is.null(Shp)){ # if a shape has been designated
Plots_ls[[Plot]] <- Plots_ls[[Plot]] + geom_polygon(data = Shp, aes(x = long, y = lat, group = group), colour = "black", fill = "NA") # add shape
}
} # end of type-loop
ggPlot <- plot_grid(plotlist = Plots_ls, ncol = 1, labels = "AUTO") # fuse the pplots into one big plot
return(ggPlot)} # export the plotKriging - the three steps
Using KrigR in this way has us use the functions download_ERA(), download_DEM(), and krigR(). Running these functions by themselves as opposed to executing the KrigR pipeline (which we cover later in this tutorial) gives you the most control and oversight of the KrigR workflow.
We will start with a simple Kriging process and subsequently make it more sophisticated.
By Extent
The most simple way in which you can run the functions of the KrigR package is by specifying a rectangular bounding box (i.e. an extent) to specify your study region(s). Let me show you how that changes the workflow and computational effort for the Saxony example. Here, we will run a small downscaling exercise of my birth-state of Saxony. It is a medium-sized state in the east of Germany and lends itself to some nice statistical downscaling since it presents us with mountainous regions along the south-eastern border and flatland areas towards the north-western edges.
Here’s the full area we will be obtaining and downscaling data for:
Extent <- extent(c(11.8, 15.1, 50.1, 51.7)) # roughly the extent of Saxony
GoogMap <- get_map(c(left = Extent[1],
right = Extent[2],
bottom = Extent[3],
top = Extent[4]))
ggmap(GoogMap) +
theme_bw() + labs(x = "Longitude", y = "Latitude") Here, you can see a map of the region with the mountains of the Erzgebirge along the southern border to the Czech Republic as well as the flat terrains around Leipzig in the north of Saxony.
Now, let’s create a separate directory within which we will store the raw ERA5-land data, GMTED2010 DEM covariate data, and Kriging outputs:
Climate Data
No we are ready to carry out our first download of climate reanalysis data through the KrigR package!
For this part of the tutorial, we download air temperature for a three-day interval around my birthday (03-01-1995) using the extent of my home-state of Saxony.
Notice that the downloading of ERA-family reanalysis data may take a short while to start as the download request gets queued with the CDS of the ECMWF before it is executed.
State_Raw <- download_ERA(
Variable = '2m_temperature',
DataSet = 'era5-land',
DateStart = '1995-01-02',
DateStop = '1995-01-04',
TResolution = 'day',
TStep = 1,
Extent = Extent,
Dir = Dir.StateExt,
API_User = API_User,
API_Key = API_Key)
Plot_Raw(State_Raw, Dates = c("02-01-1995", "03-01-1995", "04-01-1995"))Now, let’s look at the raster that was produced:
## class : RasterBrick
## dimensions : 17, 34, 578, 3 (nrow, ncol, ncell, nlayers)
## resolution : 0.1, 0.1000001 (x, y)
## extent : 11.75, 15.15, 50.05, 51.75 (xmin, xmax, ymin, ymax)
## crs : +proj=longlat +datum=WGS84 +no_defs
## source : memory
## names : index_1, index_2, index_3
## min values : 267.8741, 266.8182, 262.5400
## max values : 273.1816, 271.9754, 269.3639
As you can see, we obtained a RasterBrick object with 3 layers of data (one for each day we are interested in). Additionally, as could be gleamed from the plots above, it is apparent that my home-state got a lot cooler on the day after my birth (04/01/1995) when compared to the two preceding days. What to make of this, I leave up to you because I don’t think I like the interpretation myself if we were to assume causality here.
Keep in mind that every function within the KrigR package produces NetCDF (.nc) files in the specified directory (Dir argument in the function call) to allow for further manipulation outside of R if necessary (for example, using Panoply).
Covariates
Next, we use the download_DEM() function which comes with KrigR to obtain elevation data as our covariate of choice. This produces two rasters:
- A raster of training resolution which matches the input data in all attributes except for the data in each cell
- A raster of target resolution which matches the input data as closely as possible in all attributes except for the resolution (which is specified by the user) and the data in each cell
Both of these products are bundled into a list where the first element corresponds to the training resolution and the second element contains the target resolution covariate data. Here, we specify a target resolution of .02.
Covs_ls <- download_DEM(Train_ras = State_Raw,
Target_res = .02,
Dir = Dir.StateExt,
Keep_Temporary = TRUE)
Plot_Covs(Covs_ls)Alternatively, you can specify a different raster which should be matched in all attributes by the raster at target resolution. We get to this again at a later point in this tutorial.
For now, let’s simply inspect our list of covariate rasters:
## [[1]]
## class : RasterLayer
## dimensions : 17, 34, 578 (nrow, ncol, ncell)
## resolution : 0.1, 0.1000001 (x, y)
## extent : 11.75, 15.15, 50.05, 51.75 (xmin, xmax, ymin, ymax)
## crs : +proj=longlat +datum=WGS84 +no_defs
## source : memory
## names : DEM
## values : 75.02774, 879.5154 (min, max)
##
##
## [[2]]
## class : RasterLayer
## dimensions : 102, 204, 20808 (nrow, ncol, ncell)
## resolution : 0.01666667, 0.01666667 (x, y)
## extent : 11.74986, 15.14986, 50.04986, 51.74986 (xmin, xmax, ymin, ymax)
## crs : +proj=longlat +datum=WGS84 +no_defs
## source : memory
## names : DEM
## values : 46.5, 1150.25 (min, max)
This set of covariates has 578 and 20808 cells containing values at training and target resolution, respectively.
Kriging
Now let’s krig these data:
KrigStart <- Sys.time()
State_Krig <- krigR(Data = State_Raw,
Covariates_coarse = Covs_ls[[1]],
Covariates_fine = Covs_ls[[2]],
Keep_Temporary = FALSE,
Cores = 1,
FileName = "State_Ext.nc",
Dir = Dir.StateExt)
KrigStop <- Sys.time()
Plot_Krigs(State_Krig, Dates = c("02-01-1995", "03-01-1995", "04-01-1995"))There we go. All the data has been downscaled and we do have uncertainties recorded for all of our outputs. As you can see, the elevation patterns show up clearly in our kriged air temperature output. Furthermore, you can see that our certainty of Kriging predictions drops on the 04/01/1995 in comparison to the two preceding days. However, do keep in mind that a maximum standard error of 0.206, 0.215, 0.333 (for each layer of our output respectively) on a total range of data of 6.543, 6.289, 7.657 (again, for each layer in the output respectively) is evident of a downscaling result we can be confident in.
Now, what does the output actually look like?
## $Kriging_Output
## class : RasterBrick
## dimensions : 102, 204, 20808, 3 (nrow, ncol, ncell, nlayers)
## resolution : 0.01666667, 0.01666667 (x, y)
## extent : 11.74986, 15.14986, 50.04986, 51.74986 (xmin, xmax, ymin, ymax)
## crs : +proj=longlat +datum=WGS84 +no_defs
## source : memory
## names : var1.pred.1, var1.pred.2, var1.pred.3
## min values : 266.8200, 265.7288, 261.5947
## max values : 273.3627, 272.0174, 269.2518
##
##
## $Kriging_SE
## class : RasterBrick
## dimensions : 102, 204, 20808, 3 (nrow, ncol, ncell, nlayers)
## resolution : 0.01666667, 0.01666667 (x, y)
## extent : 11.74986, 15.14986, 50.04986, 51.74986 (xmin, xmax, ymin, ymax)
## crs : +proj=longlat +datum=WGS84 +no_defs
## source : memory
## names : var1.stdev.1, var1.stdev.2, var1.stdev.3
## min values : 0.1641075, 0.1778261, 0.2574695
## max values : 0.2059327, 0.2150844, 0.3331758
As output of the krigR() function, we obtain a list of downscaled data as the first element and downscaling standard errors as the second list element.
Finally, let’s see how long it took to carry out this shapefile-informed downscaling:
## Time difference of 36.93623 secs
Train_res set to a raster –>